Methods of synthesizing polynucleotide via reverse translation from protein and oligonucleotide used in this method

ABSTRACT

As a molecular material for synthesizing a protein-encoding polynucleotide via reverse translation from the protein, a reverse translation-mediating oligonucleotide having the following properties (a) to (d) is provided: (a) containing a nucleotide sequence “nnn” encoding an amino acid residue “Xaa”; (b) the nucleotide sequence “nnn” being separable; (c) containing a nucleotide sequence binding to the amino acid residue “Xaa”; and (d) not containing any nucleotide sequence binding to an amino acid residue other than the amino acid residue “Xaa”. Also, a method of synthesizing a polynucleotide encoding the amino acid sequence of the primitive protein via a reverse translation reaction with the above oligonucleotide is provided.

TECHNICAL FIELD

The present invention relates to a method of synthesizing a polynucleotide encoding a protein of an unknown or known sequence via reverse translation, and an oligonucleotide used in this method.

BACKGROUND ART

Total genome sequences of various organism species are being clarified, and analysis of the total gene expression condition in an individual organism has been made possible. However, such analysis remains at the level of a gene transcription product (mRNA), and analysis at the level of the protein, which is indispensable to analysis of the function of each gene, is extremely insufficient.

As a method of analyzing proteins, a method using two-dimensional electrophoresis is known; however, the number of proteins that can be analyzed by this method is only about several hundred. Further, development of a chip method using antibodies has been attempted; however, in the case of human, for example, several hundred thousand proteins are expressed from tens of thousands of genes, and therefore enormous labor and cost are necessary for producing antibodies to all of these proteins.

On the other hand, other analytical methods are widely conducted. One is a PCR method in which a partial amino acid sequence of a protein is analyzed and an oligonucleotide synthesized based on the partial amino acid sequence information is used as a primer. Another is a method in which a genomic DNA fragment and cDNA encoding a protein are obtained by library screening using an oligonucleotide as a probe, and the function of the protein is estimated based on information on the base sequence and the total amino acid sequence deduced from it, for example by homology with other known proteins. Further, by using the DNA fragment thus obtained as a probe of a DNA chip (microarray), analysis of the expression pattern of the protein is also made possible.

In the case of these methods, the partial amino acid sequence of a protein is determined by an Edman degradation method, a parallel mass spectrometry method and the like; however, the kinds of amino acid residues constituting the protein must be determined one by one, and as a result much time and labor are necessary even for determining a sequence of several to several tens of amino acids. Further, many processes and much cost and time are also necessary for synthesizing a corresponding oligonucleotide from the amino acid sequence thus determined, to obtain the intended genomic DNA and cDNA.

As described above, comprehensive analysis of a great number of proteins is an essential problem for further advancing the results of genome analysis and making the information significant. For this, a means capable of analyzing many proteins simply and correctly is essential.

The invention of the instant application has been accomplished in view of the circumstances described above, and an object thereof is to provide a quite new method capable of synthesizing, directly from a protein, a polynucleotide having the sequence encoding the protein, and a material for carrying out the method.

DISCLOSURE OF INVENTION

The invention of the instant application provides a reverse translation-mediating oligonucleotide used for synthesizing a protein-encoding polynucleotide via reverse translation from the protein, together with a primer-oligonucleotide having 3 to 30 nucleotides, wherein the reverse translation-mediating oligonucleotide has the following properties (a) to (d):

-   (a) containing a nucleotide sequence “nnn” encoding an amino acid residue “Xaa”;
-   (b) the nucleotide sequence “nnn” being separable;
-   (c) containing a nucleotide sequence binding to the amino acid residue “Xaa”; and
-   (d) not containing any nucleotide sequence binding to an amino acid residue other than the amino acid residue “Xaa”.

In a preferable embodiment, this reverse translation-mediating oligonucleotide further contains a nucleotide sequence complementary to a sequence of at least two nucleotides of the primer-oligonucleotide.

In another preferable embodiment, this reverse translation-mediating oligonucleotide is an RNA sequence, and this RNA sequence is a ribozyme. In the case of this ribozyme RNA, it is preferable that the nucleotide sequence “nnn” is positioned at the 3′ end, and the nucleotide sequence complementary to a sequence of at least two nucleotides of the primer-oligonucleotide is positioned at the 5′ end of a single strand.

The invention of the instant application also provides a set of the reverse translation-mediating oligonucleotides, which comprises at least 20 kinds of reverse translation-mediating oligonucleotides of any one of claims 1 to 5, where the 20 kinds of oligonucleotides carry respectively different nucleotide sequences “nnn” for different amino acid residues “Xaa”.

Further, the invention of the instant application provides a method of synthesizing a protein-encoding polynucleotide via reverse translation from the protein, using the reverse translation-mediating oligonucleotide of any one of claims 1 to 5, which method comprises the following steps (1) to (6):

-   (1) connecting the 3′ terminus of a primer-oligonucleotide having 3 to 30 nucleotides, to the C terminus of the protein,
-   (2) binding a nucleotide sequence of the reverse translation-mediating oligonucleotide to an amino acid residue “Xaa” positioned at the C terminus of the protein,
-   (3) connecting a nucleotide sequence “nnn” contained in the reverse translation-mediating oligonucleotide to the 5′ terminus of the primer-oligonucleotide,
-   (4) removing the amino acid residue “Xaa” from the protein, and separating the nucleotide sequence “nnn” from the reverse translation-mediating oligonucleotide,
-   (5) repeating the above-mentioned steps (1) to (4), and
-   (6) isolating a polynucleotide in which nucleotide sequences encoding amino acid sequences of the protein are connected in the correct order to the primer-oligonucleotide.

In preferable embodiments of the present method, a primer-oligonucleotide having a nucleotide sequence encoding a termination codon at the 5′ end is used, the steps (1) to (5) are continuously conducted using the above-mentioned set of reverse translation-mediating oligonucleotides, and a nucleotide sequence “nnn” is synthetically added to the reverse translation-mediating oligonucleotide from which a nucleotide sequence “nnn” has been separated in the step (4).

Still further, the invention of the instant application also provides a polypeptide which is an expression product of the polynucleotide produced by any of the above-mentioned methods.

In the present invention, the term “oligonucleotide” means a DNA sequence or RNA sequence having 3 to 100 nucleotides (nt), and in the case of a reverse translation-mediating oligonucleotide, it may have 101 or more nt provided that it retains its properties. The term “polynucleotide” means a DNA sequence or RNA sequence having 101 or more nt, and also includes those having 100 or less nt in which an amino acid sequence of the subject primitive protein or a part thereof is encoded. The oligonucleotide and polynucleotide include RNA (DNA) that has undergone chemical modification such as 2′-O-methylation, phosphorothioation and the like.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a process chart exemplifying a polynucleotide synthesis process of the present invention.

FIG. 2 is a view showing an example of the present invention and the results thereof. In (d), lanes 1 and 3 represent RNA molecules before reaction, and lanes 2 and 4 represent RNA molecules after reaction.

BEST MODES FOR CARRYING OUT THE INVENTION

The polynucleotide synthesis method of the present invention synthesizes a protein-encoding polynucleotide via reverse translation from the protein, using a reverse translation-mediating oligonucleotide of the present invention and a primer-oligonucleotide.

The reverse translation-mediating oligonucleotide is characterized in that it has the following properties (a) to (d):

-   (a) containing a nucleotide sequence “nnn” encoding an amino acid residue “Xaa”;
-   (b) the nucleotide sequence “nnn” being separable;
-   (c) containing a nucleotide sequence binding to the amino acid residue “Xaa”; and
-   (d) not containing any nucleotide sequence binding to amino acid residues other than the amino acid residue “Xaa”.

The oligonucleotide can be a DNA sequence or RNA sequence having about 20 to 100 nt. RNA is superior in reactivity to DNA since RNA has a 2′-OH group, and is therefore preferable as the oligonucleotide used in the present invention.

A DNA sequence and RNA sequence can be synthesized by ordinary methods using, respectively, a DNA polymerase and RNA polymerase. Alternatively, they can also be chemically synthesized using a DNA/RNA synthesizer and the like.

In the property (a) of this reverse translation-mediating oligonucleotide, the amino acid residue “Xaa” represents any of the 20 amino acid residues constituting a protein, and the nucleotide sequence “nnn” represents a codon sequence encoding the amino acid residue “Xaa”. Depending on the kind of the amino acid residue “Xaa”, this nucleotide sequence “nnn” is uniquely determined for certain residues, and when a plurality of candidates are possible as the nucleotide sequence “nnn”, any one of them can be adopted. In the latter case, it is also possible to determine an optimum nucleotide sequence by referring to the codon usage of the organism species from which the subject protein is derived, and the like.
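The choice of the nucleotide sequence “nnn” for a given residue “Xaa” described above can be sketched as follows. This is only a minimal illustration in Python under stated assumptions, not part of the disclosed method: the table is the standard genetic code written as RNA codons, while the function name `choose_codon` and any codon usage values passed to it are hypothetical.

```python
# Standard genetic code, grouped by amino acid residue (RNA codons).
CODONS = {
    "Ala": ["GCU", "GCC", "GCA", "GCG"],
    "Arg": ["CGU", "CGC", "CGA", "CGG", "AGA", "AGG"],
    "Asn": ["AAU", "AAC"],
    "Asp": ["GAU", "GAC"],
    "Cys": ["UGU", "UGC"],
    "Gln": ["CAA", "CAG"],
    "Glu": ["GAA", "GAG"],
    "Gly": ["GGU", "GGC", "GGA", "GGG"],
    "His": ["CAU", "CAC"],
    "Ile": ["AUU", "AUC", "AUA"],
    "Leu": ["UUA", "UUG", "CUU", "CUC", "CUA", "CUG"],
    "Lys": ["AAA", "AAG"],
    "Met": ["AUG"],
    "Phe": ["UUU", "UUC"],
    "Pro": ["CCU", "CCC", "CCA", "CCG"],
    "Ser": ["UCU", "UCC", "UCA", "UCG", "AGU", "AGC"],
    "Thr": ["ACU", "ACC", "ACA", "ACG"],
    "Trp": ["UGG"],
    "Tyr": ["UAU", "UAC"],
    "Val": ["GUU", "GUC", "GUA", "GUG"],
}


def choose_codon(xaa, usage=None):
    """Return one codon "nnn" for the residue xaa (three-letter code).

    usage is an optional, hypothetical mapping from codon to relative
    frequency in the source organism. Without it, the first candidate
    in the table is adopted, since any one candidate may be used.
    """
    candidates = CODONS[xaa]
    if usage is None:
        return candidates[0]
    # Prefer the candidate with the highest stated usage frequency.
    return max(candidates, key=lambda codon: usage.get(codon, 0.0))
```

For Met and Trp the codon is uniquely determined (AUG and UGG). For Arg, a call such as `choose_codon("Arg", {"AGG": 0.4, "CGU": 0.1})` selects AGG; the frequencies in that call are invented solely for illustration.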

In the property (b), the term “the nucleotide sequence “nnn” being separable” means that the nucleotide sequence “nnn” can be separated by excision from the oligonucleotide sequence. Such removal of a nucleotide sequence “nnn” may be advantageously conducted by, for example, a method in which an appropriate RNA (DNA) restriction enzyme recognition sequence is placed before and/or after the nucleotide sequence “nnn”, and the restriction enzyme is allowed to act to excise the nucleotide sequence “nnn”. Alternatively, the oligonucleotide may be constituted as an RNA ribozyme (for example, a hammerhead ribozyme), with the nucleotide sequence “nnn” positioned before or after its self-cleavage region. Still alternatively, it is also possible to adopt a hairpin ribozyme, an HDV ribozyme, or a ribozyme having a similar activity to those, which can be isolated by a SELEX method and the like.

In the property (c), the “nucleotide sequence binding to the amino acid residue Xaa” is a nucleotide sequence (domain) of 10 to 100 nt specifically binding to a given amino acid residue species. For example, an arginine (Arg) binding domain and a tryptophan (Trp) binding domain are well known (Biochemistry 32, 5497-5502, 1993; J. Am. Chem. Soc. 114, 3990-3991, 1992). Binding domains for the other 18 amino acid residues can also be produced by known methods such as, for example, an RNA selection method in a test tube (in vitro selection method) or a SELEX method (Nature 346, 818-822, 1990; Nature 344, 467-468, 1990; Science 249, 505-510, 1990) and the like.

Such reverse translation-mediating oligonucleotides may be provided as a set comprising at least 20 kinds of reverse translation-mediating oligonucleotides carrying respectively different sequences “nnn” for different amino acid residues “Xaa”. By use of this set, it is made possible to synthesize a polynucleotide encoding the total amino acid sequence of a primitive protein by continuous reaction in a single test tube.

The primer-oligonucleotide used together with said reverse translation-mediating oligonucleotide is a DNA sequence or RNA sequence of 3 to 30 nt, and after formation of a complex with a protein, it is utilized for sequential connection of codon sequences encoding the respective amino acid residues to its 5′ end. The nucleotide sequence may be any nucleotide sequence provided that it does not form a loop structure, and when final production of a polypeptide by expression of the synthesized polynucleotide is taken into consideration, it is preferable that the 5′ end of the primer-oligonucleotide is a nucleotide sequence coding a termination codon (UGA, UAG, TGA, TAG and the like).
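The two sequence-level conditions on the primer-oligonucleotide stated above (a length of 3 to 30 nt and, preferably, a termination codon at the 5′ end) can be checked with a short sketch such as the following. The function name `check_primer` is hypothetical, and UAA and TAA are included on the assumption that they fall under “and the like” as the remaining standard termination codons. The absence of a loop structure is not checked here.

```python
# Termination codons in RNA and DNA notation. UGA, UAG, TGA and TAG are
# named in the description; UAA and TAA are added as an assumption.
STOP_CODONS = {"UGA", "UAG", "UAA", "TGA", "TAG", "TAA"}


def check_primer(seq):
    """Return (length_ok, has_5prime_stop) for a primer written 5' to 3'."""
    seq = seq.upper()
    length_ok = 3 <= len(seq) <= 30
    has_5prime_stop = seq[:3] in STOP_CODONS
    return length_ok, has_5prime_stop
```

For instance, `check_primer("UGAUAGCCA")` reports that a 9 nt sequence beginning with UGA satisfies both conditions, whereas `check_primer("CCA")` satisfies only the length condition.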

The above-mentioned reverse translation-mediating oligonucleotide is bound to a protein via its specific amino acid residue binding sequence, and for rendering this bond stronger, it is preferable that the reverse translation-mediating oligonucleotide has a nucleotide sequence complementary to a sequence of at least two nucleotides of the primer-oligonucleotide. In that case, the reverse translation-mediating oligonucleotide also binds to the primer/protein complex through hydrogen bonds between its complementary sequence and the primer-oligonucleotide already bound to the protein.

Next, the steps in the polynucleotide synthesis method of the present invention will be illustrated in detail referring to FIG. 1. FIG. 1 is a schematic view of the case where an RNA sequence is used as the oligonucleotide, and in FIG. 1 and the following descriptions, the reverse translation-mediating oligonucleotide is described as “rtRNA”, and the primer-oligonucleotide is described as “Pre-mRNA”.

Step (1): FIG. 1(a)(b)

The 3′ end of Pre-mRNA is connected to the C terminus of the primitive protein to form a protein/Pre-mRNA complex. In this example, Pre-mRNA has the anchor sequence (CCA) at the 3′ end, and has two termination codons (UGA, UAG) at the 5′ end. There is no particular restriction on the anchor sequence.

In this Pre-mRNA, the anchor sequence at its 3′ end is connected to the C terminus of the protein. Such a connection can be made by using an enzyme having an activity of covalently bonding poliovirus RNA to VPg protein (Cell 59, 511-519, 1989), or by a ribozyme having a similar activity to such an enzyme, which may be selected by a SELEX method.

Step (2): FIG. 1(c)

rtRNA^(Arg), which has a nucleotide sequence binding to the amino acid residue Arg positioned at the C terminus of the protein, is bound to the protein. This rtRNA^(Arg) has the Arg codon (AGG) at the 3′ end and has, at the 5′ end, the complementary sequence (UCG) to the anchor sequence (CCA). Further, it has an Arg binding sequence in its intermediate part. The Arg binding sequence and the anchor complementary sequence allow the rtRNA^(Arg) to bind to the protein/Pre-mRNA complex.

Step (3): FIG. 1(d)

The Arg codon (AGG) of rtRNA^(Arg) is transferred to the 5′ end of Pre-mRNA. Such codon transfer can be conducted by, for example, using a hammerhead ribozyme as rtRNA, as shown in the following example. Alternatively, it can also be conducted by methods using a protein enzyme such as RNA ligase, or a ribozyme having a similar activity to said enzyme, which may be isolated by a SELEX method.

Step (4): FIG. 1(e)

The amino acid residue Arg at the C terminus of the protein is removed from the protein, and the Arg codon connected to Pre-mRNA is separated from rtRNA. For removing the amino acid residue Arg from the protein, for example, only the one amino acid residue at the C terminus may be removed using an appropriate peptidase. Alternatively, this step can also be conducted by using a ribozyme having a similar activity to the peptidase, selected by a SELEX method. By removal of the amino acid residue Arg, and of the rtRNA bound to this residue, from the protein, the Arg codon connected to the Pre-mRNA is also separated from rtRNA. In the case of using, for example, a ribozyme as shown in the example as rtRNA, separation of the Arg codon can also be conducted by self-cleavage of the ribozyme at a given position (before the Arg codon) by allowing Mg²⁺ to act on it. Alternatively, a codon sequence can also be separated by using a hairpin ribozyme, an HDV ribozyme, or a ribozyme having a similar activity to those, selected by a SELEX method.

Step (5): FIG. 1(f)(g)

The above-mentioned steps (1) to (4) are sequentially repeated. In the case of the example shown in FIG. 1, the codon sequence (AUG) of isoleucine (Ile) is connected to a primer-oligonucleotide by using rtRNA^(Ile).

Step (6): FIG. 1(h)

Nucleotide sequences encoding amino acid sequences of the protein are connected in the correct order to the primer-oligonucleotide, and this polynucleotide is isolated.
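The order in which codons accumulate over steps (1) to (6) can be followed with a purely symbolic sketch. It models only the sequence bookkeeping, not any of the binding or enzymatic reactions. The function name `reverse_translate` and the one-codon-per-residue table `CODON_FOR` are hypothetical; the table uses the standard genetic code, with Arg assigned AGG as in the description, and the default primer is written as the two termination codons (UGA, UAG) followed by the anchor sequence (CCA).

```python
# One codon per residue (standard genetic code). The choice among
# synonymous codons is arbitrary here, except Arg -> AGG.
CODON_FOR = {
    "Ala": "GCU", "Arg": "AGG", "Asn": "AAU", "Asp": "GAU", "Cys": "UGU",
    "Gln": "CAA", "Glu": "GAA", "Gly": "GGU", "His": "CAU", "Ile": "AUU",
    "Leu": "CUU", "Lys": "AAA", "Met": "AUG", "Phe": "UUU", "Pro": "CCU",
    "Ser": "UCU", "Thr": "ACU", "Trp": "UGG", "Tyr": "UAU", "Val": "GUU",
}


def reverse_translate(protein, primer="UGAUAGCCA"):
    """Return the polynucleotide (5' to 3') built from a protein.

    protein lists residues (three-letter codes) from N terminus to C terminus.
    """
    remaining = list(protein)
    product = primer                         # step (1): primer joined to the C terminus
    while remaining:                         # step (5): repeat while residues remain
        xaa = remaining[-1]                  # step (2): rtRNA for Xaa binds the C-terminal residue
        product = CODON_FOR[xaa] + product   # step (3): codon joined to the 5' end of the primer
        remaining.pop()                      # step (4): residue removed, codon released from rtRNA
    return product                           # step (6): polynucleotide to be isolated


print(reverse_translate(["Met", "Trp", "Arg"]))  # prints AUGUGGAGGUGAUAGCCA
```

Because each new codon is attached to the 5′ end while residues are consumed from the C terminus, the codons end up in N-terminal to C-terminal order, followed by the termination codons of the primer.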

The above-mentioned methods can be conducted not only in liquid phase but also under a condition in which the primitive protein is fixed to a substrate and the like.

The above-mentioned method of the present invention can be conducted continuously by using the set of reverse translation-mediating oligonucleotides provided by this invention. In this case, it is also effective to allow rtRNA from which a codon sequence was separated in the step (4) to regenerate the same codon sequence. Namely, the rtRNA whose codon sequence has thus been regenerated is used to mediate the reaction again when the corresponding amino acid residue emerges in the subsequent reverse translation process. This means that, in an oligonucleotide set, it is theoretically sufficient to prepare one each of 20 oligonucleotides corresponding to the respective amino acid residues. Regeneration of a codon sequence can be conducted by using, for example, a ribozyme having an activity analogous to tRNA CCA lyase or RNA replicase, or a ribozyme having a similar activity to those, selected by a SELEX method.

The polynucleotide synthesized by the above-mentioned method is composed of the nucleotide sequence encoding the total amino acid sequence constituting the primitive protein, and when the synthesized polynucleotide is a DNA sequence, a polypeptide composed of the same amino acid sequence as that of the primitive protein can be readily expressed in an appropriate host-vector system. When the polynucleotide is RNA, the intended polypeptide can be made by expression of cDNA that is synthesized in advance with a reverse transcriptase.

Further, analysis of gene expression using a DNA array becomes possible by allowing several thousand proteins in a total cell extract to be reverse-translated simultaneously in one tube and by labeling the resulting polynucleotides (in the case of RNA, cDNA synthesized by reverse transcription). Namely, in usual DNA array analysis, mRNA (actually, cDNA synthesized from mRNA) in cells is analyzed; however, by using the polynucleotide synthesized by the method of the present invention as the analysis subject, the total proteins actually expressed in a cell can be analyzed. Therefore, analytical efficiency and accuracy increase by far as compared with conventional methods using a gene transcription product (mRNA) as the subject.

EXAMPLES

The invention of the instant application will be illustrated further in detail and specifically referring to the following examples, but the scope of the invention of the instant application is not limited by the following examples.

As shown in FIG. 2(a), rtRNA^(Arg) of 83 nt (SEQ ID No. 1) and Pre-mRNA of 8 nt were prepared. The rtRNA^(Arg) is a hammerhead ribozyme (Annu. Rev. Biochem. 61, 641-671, 1992) having an arginine binding domain and the arginine codon (AGG) at the 3′ end. The rtRNA^(Arg) was synthesized from a double-stranded DNA template containing the T7 promoter by using T7 RNA polymerase (Takara Shuzo Co., Ltd.) in the presence of [α-³²P] UTP (Amersham). The transcription reaction was conducted at 37° C. for 1 hour in a buffer solution containing 40 mM Tris-HCl (pH 8.0), 20 mM MgCl₂, and 5 mM DTT. The Pre-mRNA (8 nt), which is complementary to the 5′ sequence of the rtRNA^(Arg), was synthesized using a DNA/RNA synthesizer, and then its 5′ end was labeled with T4 polynucleotide kinase (Takara Shuzo Co., Ltd.) in the presence of [γ-³²P] ATP (Amersham) in a buffer solution containing 50 mM Tris-HCl (pH 8.0), 10 mM MgCl₂, and 5 mM DTT, over a period of 30 minutes at 37° C. Prior to analysis, these RNAs were gel-purified.

Initially, these two RNA molecules were covalently connected via a phosphoester bond between the 3′ end of rtRNA^(Arg) and the 5′ end of Pre-mRNA (FIG. 2(b)). This connection was conducted using T4 RNA ligase (Takara Shuzo Co., Ltd.), at 4° C. for 1 hour, in a buffer solution (50 mM Tris-HCl (pH 7.5), 10 mM MgCl₂, 10 mM DTT and 1 mM ATP). Then, the rtRNA^(Arg) was allowed to undergo self-cleavage in the same buffer solution at 50° C. for 1 hour (FIG. 2(c)). The reaction product was analyzed on a 10% polyacrylamide-8 M urea gel, and the gel was subjected to autoradiography.

As a result, two new RNA molecules were obtained as shown in FIG. 2(d). Namely, they are rtRNA^(Arg) of 80 nt, from which the arginine codon sequence had been deleted, and Pre-mRNA of 11 nt, to which the codon sequence had been added (SEQ ID No. 2).
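The lengths of the two products follow directly from ligation at the 3′ end of rtRNA^(Arg) and self-cleavage immediately before its 3′-terminal codon, as the following sketch shows. The sequences are placeholders ("N" stands for any nucleotide, since SEQ ID No. 1 and No. 2 are not reproduced here), and the function name `transfer_codon` is hypothetical.

```python
def transfer_codon(rt_rna, pre_mrna, codon_len=3):
    """Ligate the 3' end of rt_rna to the 5' end of pre_mrna, then cleave
    just before the 3'-terminal codon of the rt_rna portion.

    Returns (cleaved rtRNA, Pre-mRNA extended by the codon at its 5' end).
    """
    ligated = rt_rna + pre_mrna
    cut = len(rt_rna) - codon_len  # self-cleavage site, before the codon
    return ligated[:cut], ligated[cut:]


# Placeholder molecules with the lengths given in the example.
rt_rna_arg = "N" * 80 + "AGG"   # 83 nt, Arg codon at the 3' end
pre_mrna = "N" * 8              # 8 nt

cleaved_rt, extended_primer = transfer_codon(rt_rna_arg, pre_mrna)
print(len(cleaved_rt), len(extended_primer))  # prints 80 11
```

The extended Pre-mRNA begins with AGG, which corresponds to the codon having been moved from the 3′ end of rtRNA^(Arg) to the 5′ end of the primer.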

From the results above, it was confirmed that, through mediation by an oligonucleotide molecule having the ability to bind to an amino acid residue “Xaa”, its codon sequence can be transferred to a primer-oligonucleotide.

INDUSTRIAL APPLICABILITY

According to the invention of the instant application, there are provided a quite new method in which, from a protein of unknown or known sequence, a polynucleotide encoding the protein can be synthesized by a reverse translation reaction, and a molecular material for this method. By this invention, protein analysis in the post-genome era is expected to progress greatly.

1. A reverse translation-mediating oligonucleotide used for synthesizing a protein-encoding polynucleotide via reverse translation from the protein, together with a primer-oligonucleotide having 3 to 30 nucleotides, wherein the reverse translation-mediating oligonucleotide has the following properties (a) to (d): (a) containing a nucleotide sequence “nnn” encoding an amino acid residue “Xaa”; (b) the nucleotide sequence “nnn” being separable; (c) containing a nucleotide sequence binding to the amino acid residue “Xaa”; and (d) not containing any nucleotide sequence binding to an amino acid residue other than the amino acid residue “Xaa”.
2. The reverse translation-mediating oligonucleotide of claim 1, further containing a nucleotide sequence complementary to a sequence of at least two nucleotides of the primer-oligonucleotide.
3. The reverse translation-mediating oligonucleotide of claim 1, wherein the oligonucleotide is an RNA sequence.
4. The reverse translation-mediating oligonucleotide of claim 3, wherein the RNA sequence is a ribozyme.
5. The reverse translation-mediating oligonucleotide of claim 4, wherein the nucleotide sequence “nnn” is positioned at the 3′ end, and the nucleotide sequence complementary to a sequence of at least two nucleotides of the primer-oligonucleotide is positioned at the 5′ end of the single strand.
6. A set of the reverse translation-mediating oligonucleotides, which comprises at least 20 kinds of reverse translation-mediating oligonucleotides of claim 1, where the 20 kinds of oligonucleotides carry respectively different nucleotide sequences “nnn” for different amino acid residues “Xaa”.
7. A method of synthesizing a protein-encoding polynucleotide via reverse translation from the protein, using the reverse translation-mediating oligonucleotide of claim 1, which method comprises the following steps (1) to (6): (1) connecting the 3′ terminus of a primer-oligonucleotide having 3 to 30 nucleotides, to the C terminus of the protein, (2) binding a nucleotide sequence of the reverse translation-mediating oligonucleotide to an amino acid residue “Xaa” positioned at the C terminus of the protein, (3) connecting a nucleotide sequence “nnn” contained in the reverse translation-mediating oligonucleotide to the 5′ terminus of the primer-oligonucleotide, (4) removing the amino acid residue “Xaa” from the protein, and separating the nucleotide sequence “nnn” from the reverse translation-mediating oligonucleotide, (5) repeating the above-mentioned steps (1) to (4), and (6) isolating a polynucleotide in which nucleotide sequences encoding amino acid sequences of the protein are connected in the correct order to the primer-oligonucleotide.
8. The method according to claim 7, using a primer-oligonucleotide having a nucleotide sequence encoding a termination codon at the 5′ end.
9. The method according to claim 7 or 8, wherein the steps (1) to (5) are continuously conducted using the set of reverse translation-mediating oligonucleotides of claim 6.
10. The method according to claim 9, wherein a nucleotide sequence “nnn” is synthetically added to the reverse translation-mediating oligonucleotide from which a nucleotide sequence “nnn” has been separated in the step (4).
11. A polypeptide which is an expression product of the polynucleotide produced by the method according to claim 7.